A Probabilistic Neural Network Based Classification of Spam Mails Using Particle Swarm Optimization Feature Selection
Abstract
Email use has grown explosively as a means of communication across the world. This worldwide communication also has disadvantages, such as spam mails. Spammers send useless, unwanted mails, and even malicious content, to users' email accounts. The increasing number of spam mails increases the need for a spam detection architecture based on machine learning classification. The proposed spam detection architecture is composed of a feature selection process to minimize the error rate, a redundancy removal method, and finally a classification system for separating spam mails from legitimate mails. Incoming mails are preprocessed using three traditional steps: Tokenization, Stemming and Stop Word Removal. The Vector Quantization (VQ) process is used to remove redundancy in both the training and the preprocessed data. The preprocessed, redundancy-removed training and testing data are then given to the feature selector, the well-known Particle Swarm Optimization (PSO) algorithm, which mines the optimal features suitable for classification. Finally, using the selected features, the Probabilistic Neural Network (PNN) separates spam mails from legitimate mails with more accuracy and precision.
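As a rough illustration of the final classification step, the following is a minimal sketch of a Probabilistic Neural Network decision rule in plain Python. It is not the authors' implementation: the function name `pnn_classify`, the Gaussian kernel width `sigma`, the toy two-feature vectors, and the labels "spam" and "ham" are all assumptions made for the example, and equal class priors are assumed. Each training example acts as one pattern unit, the activations are averaged per class, and the class with the largest average is chosen.

```python
import math
from collections import defaultdict


def pnn_classify(train_x, train_y, x, sigma=0.5):
    """Classify one feature vector x with a basic PNN (Parzen-window) rule.

    Pattern layer: one Gaussian kernel per training example.
    Summation layer: average kernel activation per class.
    Decision layer: the class with the largest average wins.
    Assumes equal class priors; sigma is a hypothetical smoothing width.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for xi, yi in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
        sums[yi] += math.exp(-sq_dist / (2.0 * sigma ** 2))
        counts[yi] += 1
    scores = {label: sums[label] / counts[label] for label in sums}
    return max(scores, key=scores.get), scores


# Toy data (made up): two already-selected features per mail.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = ["spam", "spam", "ham", "ham"]

label, scores = pnn_classify(train_x, train_y, [0.85, 0.75])
print(label, scores)
```

In a pipeline like the one described, the feature vectors passed to such a function would be the columns kept by the PSO feature selector after preprocessing and Vector Quantization; those stages are not shown here.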
Similar resources
Email Spam Detection Using Combination of Particle Swarm Optimization and Artificial Neural Network and Support Vector Machine
The increasing use of e-mail in the world, owing to its simplicity and low cost, has led many Internet users to become interested in developing their work in the context of the Internet. In the meantime, many natural or legal persons send unrelated e-mails in bulk. Hence, classification and identification of spam emails is very important. In this paper, the combined Particle Swarm Opti...
Persian Handwritten Digit Recognition Using Particle Swarm Probabilistic Neural Network
Handwritten digit recognition can be categorized as a classification problem. The Probabilistic Neural Network (PNN) is one of the most effective and useful classifiers, and it works based on the Bayesian rule. In this paper, in order to recognize Persian (Farsi) handwritten digits, a combination of an intelligent clustering method and PNN has been utilized. Hoda database, which includes 80000 P...
A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is unwanted email that is harmful to communications around the world. Spam is a growing problem in personal email, so it is essential to detect it. Machine learning is very useful for solving this problem, as it shows good results in learning all the requisite patterns for classification due to its adaptive nature. Nonetheless, in spam detection, there are a large num...
A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
Email is one of the fastest and most economical forms of communication. The growing number of email users has caused an increase in spam in recent years. As we know, spam not only costs users profits, time, and bandwidth, but has also become a risk to the efficiency, reliability, and security of a network. Spam developers are always trying to find ways to evade the existing filters, therefore new filters to de...
Accurate Fault Classification of Transmission Line Using Wavelet Transform and Probabilistic Neural Network
Fault classification in distance protection of transmission lines, considering the wide variation in fault operating conditions, has been a very challenging task. This paper presents a probabilistic neural network (PNN) and a new feature selection technique for fault classification in transmission lines. Initially, the wavelet transform is used for feature extraction from half cycle of post-fa...